DRIVERS-2917 - Standardized Performance Testing of ODMs and Integrations #1828
base: master
Conversation
@@ -0,0 +1 @@
{"field1":"miNVpaKW","field2":"CS5VwrwN","field3":"Oq5Csk1w","field4":"ZPm57dhu","field5":"gxUpzIjg","field6":"Smo9whci","field7":"TW34kfzq","field8":55336395,"field9":41992681,"field10":72188733,"field11":46660880,"field12":3527055,"field13":74094448}
format
### Benchmark Server

The MongoDB ODM Performance Benchmark must be run against a standalone MongoDB server running the latest stable database
I think we can open this up to be either a standalone or a replica set of size 1 (this is because some ODMs leverage transactions).
Using a replica set of size 1 makes more sense here, agreed.
### Benchmark placement and scheduling

The MongoDB ODM Performance Benchmark should be placed within the ODM's test directory as an independent test suite. Due
I still think we should leave an option for folks to create their own benchmarking repo if that helps out. I'm open to others' takes on this one, seeing as I worry about maintainers not wanting a benchmark repo.
to the relatively long runtime of the benchmarks, including them as part of an automated suite that runs against every
PR is not recommended. Instead, scheduling benchmark runs on a regular cadence is the recommended method of automating
this suite of tests.
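As one possible illustration of this placement-and-scheduling guidance (a sketch only, not part of the spec: the marker mechanism, the `RUN_ODM_BENCHMARKS` environment variable, and the test name are all assumptions), a Python/pytest suite could live in the ODM's test directory but be excluded from per-PR runs and enabled only by a scheduled job:

```python
# Hypothetical sketch: keep the benchmark suite inside the ODM's test directory
# but skip it unless the scheduled CI job opts in. The RUN_ODM_BENCHMARKS
# variable and the test name are illustrative assumptions, not spec text.
import os

import pytest

# Applied to every test in this module; per-PR CI never sets the variable, so
# the benchmarks only execute on the scheduled cadence.
pytestmark = pytest.mark.skipif(
    os.environ.get("RUN_ODM_BENCHMARKS") != "1",
    reason="ODM benchmarks run on a schedule, not per PR",
)


def test_small_doc_insertion_benchmark():
    """Placeholder for one of the benchmarks defined in this specification."""
    ...
```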
Per your suggestion earlier, we should include some new information about testing mainline use cases.
As discussed earlier in this document, ODM feature sets vary significantly across libraries. Many ODMs have features
unique to them or their niche in the wider ecosystem, which makes specifying concrete benchmark test cases for every
possible API unfeasible. Instead, ODM authors should determine what mainline use cases of their library are not covered
by the benchmarks specified above and expand this testing suite with additional benchmarks to cover those areas.
This section is attempting to specify that ODMs should implement additional benchmark tests to cover mainline use cases that do not fall into those included in this specification. One example would be the use of Django's `in` filter operator: `Model.objects.filter(field__in=["some_val"])`.
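For illustration, here is a minimal sketch of such an ODM-specific benchmark for the Django example above. The model, field, iteration count, and timing harness are all assumptions; a real implementation would reuse whatever benchmark runner the ODM's suite already defines.

```python
# Hypothetical sketch of an ODM-specific benchmark: time Django's `__in` filter
# against a pre-populated collection. SmallDoc, field1, and the iteration count
# are illustrative assumptions, not defined by this specification.
import time

from myapp.models import SmallDoc  # assumed model with a CharField named "field1"


def benchmark_in_filter(iterations: int = 100) -> float:
    """Return the median wall-clock latency of an `__in` filter query."""
    values = ["miNVpaKW", "CS5VwrwN", "Oq5Csk1w"]
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        # list() forces full evaluation of the otherwise-lazy queryset.
        list(SmallDoc.objects.filter(field1__in=values))
        samples.append(time.perf_counter() - start)
    samples.sort()
    return samples[len(samples) // 2]
```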
### Benchmark Server

The MongoDB ODM Performance Benchmark must be run against a MongoDB replica set of size 1 running the latest stable
database version without authentication or SSL enabled.
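A benchmark harness could make this requirement explicit with a small pre-flight check (a sketch only; the connection URI and the use of PyMongo's `hello` command here are assumptions, not spec requirements):

```python
# Hypothetical pre-flight check: confirm the benchmark target is a single-node
# replica set before running, as required above. The URI is an illustrative
# assumption.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/?directConnection=true")
hello = client.admin.command("hello")
assert "setName" in hello, "expected a replica set, not a standalone"
assert len(hello.get("hosts", [])) == 1, "expected a replica set of size 1"
```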
Are we concerned at all about accounting for performance variation due to server performance differences? In the drivers, we keep the server version patch-pinned and upgrade rarely and intentionally via independent commits in order to ensure that our performance testing results are meaningful and are only reflective of the changes in the system under test (the driver, or, in this case, the ODM). If the goal is only to track the performance of ODMs relative to each other and relative to the corresponding drivers, is the intention to have the drivers also implement these tests against the latest server so that we could get that apples-to-apples comparison?
> Are we concerned at all about accounting for performance variation due to server performance differences?
From the Django implementation:
> This is NOT intended to be a comprehensive test suite for every operation, only the most common and widely applicable
@NoahStapp and @Jibola are working on this project for DBX Python (although I am reviewing the implementation PR), so this is just a drive by comment from me, but my impression is that the spec is at least initially focused on getting all the ODMs to agree on what to test.
> In the drivers, we keep the server version patch-pinned and upgrade rarely and intentionally via independent commits in order to ensure that our performance testing results are meaningful and are only reflective of the changes in the system under test (the driver, or, in this case, the ODM). If the goal is only to track the performance of ODMs relative to each other and relative to the corresponding drivers, is the intention to have the drivers also implement these tests against the latest server so that we could get that apples-to-apples comparison?
One more drive by comment: I'd expect each ODM to "perform well" under similar server circumstances (testing the driver is a good call out!) but I'm not sure apples-to-apples is the goal. If other ODMs test their performance using the spec and can demonstrate "good performance" and/or catch performance issues they would otherwise have missed, that would indicate some measure of success to me in the spec design.
I chose the latest stable server version here for the following reason: we've made server performance an explicit company-wide goal. When users experience performance issues on older server versions, one of the first things we recommend is that they upgrade to a newer version. At least in the Python driver, we only run performance tests against 8.0. Using the latest stable version ensures that our performance tests always take advantage of any server improvements and isolate performance issues in the ODM or underlying driver.
Implementing these same tests in the driver for a direct apples-to-apples comparison is a significant amount of work. Several of the tests here use datasets similar to the driver tests' for easier comparison, so using the same server version as the driver tests, to reduce differences, could be useful.
> Using the latest stable version ensures that our performance tests always take advantage of any server improvements and isolate performance issues in the ODM or underlying driver.
I think we should be careful about our goals here: if it is to take advantage of any server improvements and track performance explicitly relative to the most current server performance, then this approach is fine. However, this approach will not isolate performance issues in the ODM or driver because: 1) server performance is not guaranteed to always improve in every release for every feature: the overall trends of the server performance for most features will hopefully keep moving up, but between releases there may be "acceptable" regressions to certain features that are considered a tradeoff to an improvement in another area, and 2) server performance improvements could mask ODM regressions that happen concurrently with the server upgrade. We should be explicit about accepting both of these risks if we are going to move forward with this approach (i.e., note this somewhere in the spec text).
Python Django implementation: mongodb/django-mongodb-backend#366.